library(bayesrules)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(rstan)
## Loading required package: StanHeaders
## rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## 
## Attaching package: 'rstan'
## The following object is masked from 'package:tidyr':
## 
##     extract
library(rstanarm)
## Loading required package: Rcpp
## This is rstanarm version 2.21.1
## - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
## - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
## - For execution on a local, multicore CPU with excess RAM we recommend calling
##   options(mc.cores = parallel::detectCores())
## 
## Attaching package: 'rstanarm'
## The following object is masked from 'package:rstan':
## 
##     loo
library(bayesplot)
## This is bayesplot version 1.8.1
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
##    * Does _not_ affect other ggplot2 plots
##    * See ?bayesplot_theme_set for details on theme setting
library(tidybayes)
library(janitor)
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(broom.mixed)

Exercise 9.1

  1. A normal regression model is a reasonable prior choice for B0 and B1 because these values can live anywhere along the real line: they can be negative, positive, or zero.

  2. Sigma must be positive; therefore, an exponential model is more appropriate.

  3. Weakly informative priors reflect general uncertainty about the model parameters from the outset (at the prior stage), spreading that uncertainty across a range of sensible parameter values. Vague priors can be so vague that they put weight on non-sensible parameter values (e.g., in the temperature example, an unrealistically high value for B1, say 1 million more bike rides per degree increase in temperature). In this sense, weakly informative priors are more efficient than vague priors, since they restrict the B0/B1 values explored to a sensible range.
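As a rough numeric sketch of this point (the prior means and standard deviations below are illustrative, not from the book), we can compare how much mass a weakly informative prior versus a vague prior puts on non-sensible slopes:

```r
# Hypothetical priors for beta_1 (extra rides per degree of temperature)
set.seed(84735)
weak  <- rnorm(10000, mean = 100, sd = 40)   # weakly informative prior draws
vague <- rnorm(10000, mean = 0,   sd = 1e6)  # vague prior draws

# Fraction of prior draws implying an absurd slope (> 10,000 rides per degree)
mean(abs(weak) > 10000)    # essentially 0
mean(abs(vague) > 10000)   # nearly all of them
```

The vague prior wastes almost all of its mass on slopes no one considers plausible, which is exactly the inefficiency described above.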

Exercise 9.2

  1. X=arm length; Y=height

  2. Y=annual CO2 emissions; X= distance between home and work

  3. X=age; Y=vocabulary

  4. X=sleep habits; Y=reaction time

Exercise 9.3

  1. B0 is the height at age 0 months; B1 should be positive (grow taller with age)

  2. B0 is the number of GitHub followers a developer has when they made 0 commits in the past week; B1 should be positive (more commits suggests a more productive/collaborative developer, which attracts more followers)

  3. B0 is the number of visitors when there is no rainfall (X = 0); B1 should be negative (as rainfall increases, fewer visitors come)

  4. B0 is the number of hours a person sleeps when they watch no Netflix; B1 should be negative (more Netflix, less sleep)

Exercise 9.4

The bigger the sigma, the weaker the relationship between X and Y: a greater sigma means greater variability of Y around the model line, so the data cluster less tightly around the linear trend.
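A quick simulation (with arbitrary illustrative values) makes this concrete: the same underlying line paired with a larger sigma yields a much weaker observed correlation between X and Y.

```r
# Same line y = 100 + 5x, two choices of sigma (values are illustrative)
set.seed(84735)
x <- runif(500, 0, 100)
y_small_sigma <- 100 + 5 * x + rnorm(500, sd = 10)   # tight scatter
y_large_sigma <- 100 + 5 * x + rnorm(500, sd = 500)  # wide scatter

cor(x, y_small_sigma)   # close to 1
cor(x, y_large_sigma)   # much weaker
```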

Exercise 9.5

  1. Y = annual orange juice consumption; X = age (in years)

  2. \[ Y_i \mid \mu, \sigma \sim N(\mu, \sigma^2) \]

  3. \[ Y_i \mid \beta_0, \beta_1, \sigma \sim N(\mu_i, \sigma^2) \text{ with } \mu_i = \beta_0 + \beta_1 X_i \]

  4. The unknown parameters are B0, B1, and sigma, with the following prior models:

\[ \beta_0 \sim N(m_0, s_0) \]

\[ \beta_1 \sim N(m_1, s_1) \]

\[ \sigma \sim \text{Exp}(l) \]

  5. Since B0 is the amount of orange juice consumed per year at the age of 0 years, we'll say the mean for B0 is 0 with a small standard deviation of 0.001 (since it would be unlikely for a zero-year-old to consume much juice).

\[ \beta_0 \sim N(0, 0.001) \]

B1 is the change in annual orange juice consumption for each additional year of age. We'll estimate that consumption increases by about 0.25 per year of age, with a small standard deviation (0.001).

\[ \beta_1 \sim N(0.25, 0.001) \]

Sigma captures the variability of individual consumption around the model line. Since not everyone likes orange juice, I hypothesize significant variability in consumption at any given age. A prior mean of 0.75 for sigma yields a rate l of 1/0.75 = 4/3.

\[ \sigma \sim \text{Exp}(1.33) \]

We can plot the distribution of each of our parameters as follows:

plot_normal(mean = 0, sd = 0.001) + 
  labs(x = "beta_0", y = "pdf")

plot_normal(mean = 0.25, sd = 0.001) + 
  labs(x = "beta_1", y = "pdf")

plot_gamma(shape = 1, rate = 1.33) + 
  labs(x = "sigma", y = "pdf")
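As a sanity check on the rate parameter, we can also verify numerically that an Exp(4/3) prior centers sigma near 0.75:

```r
# Monte Carlo check: the mean of Exp(l) is 1/l, so rate 4/3 gives prior mean 0.75
set.seed(84735)
sigma_draws <- rexp(100000, rate = 4/3)
mean(sigma_draws)   # approximately 0.75
```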

Exercise 9.6

  1. Y = tomorrow’s high temperature; X = today’s high temperature

  2. \[ Y_i \mid \mu, \sigma \sim N(\mu, \sigma^2) \]

  3. \[ Y_i \mid \beta_0, \beta_1, \sigma \sim N(\mu_i, \sigma^2) \text{ with } \mu_i = \beta_0 + \beta_1 X_i \]

  4. The unknown parameters are B0, B1, and sigma, with the following prior models:

\[ \beta_0 \sim N(m_0, s_0) \]

\[ \beta_1 \sim N(m_1, s_1) \]

\[ \sigma \sim \text{Exp}(l) \]

  5. Since B0 is tomorrow's high temperature when today's high temperature is 0, we'll estimate a mean of 0 for B0 and a small standard deviation, 10 (if the temperature is 0 today, it's unlikely that tomorrow could be much hotter, and it couldn't conceivably get much colder; 0 degrees F is very cold).

\[ \beta_0 \sim N(0, 10) \]

B1 represents the change in tomorrow's high temperature for each one-degree increase in today's high temperature. I would guess this is close to a 1-to-1 ratio, so I'll estimate a mean of 1 for B1 with a small standard deviation (0.005).

\[ \beta_1 \sim N(1, 0.005) \]

Sigma captures the variability of tomorrow's temperature around the model line. Since today's temperature is often not a strong predictor of tomorrow's, we'll estimate a large prior mean for sigma (30). This yields l = 1/30 ≈ 0.033.

\[ \sigma \sim \text{Exp}(0.033) \]

Exercise 9.7

  1. False: MCMC gives us an approximation of the posterior

  2. True

Exercise 9.8

  1. Bunnies model
bunnies_model <- stan_glm(age ~ height, data = bunnies,
                       family = gaussian,
                       prior_intercept = normal(5000, 1000),
                       prior = normal(100, 40), 
                       prior_aux = exponential(0.0008),
                       chains = 4, iter = 5000*2, seed = 84735)
  2. Songs model
songs_model <- stan_glm(snaps ~ clicks, data = songs,
                       family = gaussian,
                       prior_intercept = normal(5000, 1000),
                       prior = normal(100, 40), 
                       prior_aux = exponential(0.0008),
                       chains = 4, iter = 5000*2, seed = 84735)
  3. Dogs model
dogs_model <- stan_glm(age ~ happiness, data = dogs,
                       family = gaussian,
                       prior_intercept = normal(5000, 1000),
                       prior = normal(100, 40), 
                       prior_aux = exponential(0.0008),
                       chains = 4, iter = 5000*2, seed = 84735)

Exercise 9.9

  1. Full specification of the model: \[ Y_i \mid \beta_0, \beta_1, \sigma \sim N(\mu_i, \sigma^2) \text{ with } \mu_i = \beta_0 + \beta_1 X_i \]

Priors:

\[ \beta_0 \sim N(5000, 2000) \]

\[ \beta_1 \sim N(-10, 5) \]

\[ \sigma \sim \text{Exp}(0.0005) \]

Plotting our prior understanding:

plot_normal(mean = 5000, sd = 2000) + 
  labs(x = "beta_0", y = "pdf")

plot_normal(mean = -10, sd = 5) + 
  labs(x = "beta_1", y = "pdf")

plot_gamma(shape = 1, rate = 0.0005) + 
  labs(x = "sigma", y = "pdf")
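Before fitting anything with stan_glm, we can also simulate these priors directly to see what ridership they imply on, say, a 90% humidity day. This is a quick base-R sketch using only the prior values above:

```r
# Prior predictive sketch for rides on a 90% humidity day
set.seed(84735)
beta_0 <- rnorm(10000, 5000, 2000)
beta_1 <- rnorm(10000, -10, 5)
sigma  <- rexp(10000, 0.0005)
rides_sim <- rnorm(10000, mean = beta_0 + beta_1 * 90, sd = sigma)

# Middle 80% of prior-plausible ridership values
quantile(rides_sim, c(0.1, 0.9))
```

The wide spread reflects how weakly informative these priors are.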

  2. First, we’ll load the data
data(bikes)
glimpse(bikes)
## Rows: 500
## Columns: 13
## $ date        <date> 2011-01-01, 2011-01-03, 2011-01-04, 2011-01-05, 2011-01-0…
## $ season      <fct> winter, winter, winter, winter, winter, winter, winter, wi…
## $ year        <int> 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011…
## $ month       <fct> Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan, Jan…
## $ day_of_week <fct> Sat, Mon, Tue, Wed, Fri, Sat, Mon, Tue, Wed, Thu, Fri, Sat…
## $ weekend     <lgl> TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALS…
## $ holiday     <fct> no, no, no, no, no, no, no, no, no, no, no, no, no, yes, n…
## $ temp_actual <dbl> 57.39952, 46.49166, 46.76000, 48.74943, 46.50332, 44.17700…
## $ temp_feel   <dbl> 64.72625, 49.04645, 51.09098, 52.63430, 50.79551, 46.60286…
## $ humidity    <dbl> 80.5833, 43.7273, 59.0435, 43.6957, 49.8696, 53.5833, 48.2…
## $ windspeed   <dbl> 10.749882, 16.636703, 10.739832, 12.522300, 11.304642, 17.…
## $ weather_cat <fct> categ2, categ1, categ1, categ1, categ2, categ2, categ1, ca…
## $ rides       <int> 654, 1229, 1454, 1518, 1362, 891, 1280, 1220, 1137, 1368, …

Now, we’ll build the model:

bike_model <- stan_glm(rides ~ humidity, data = bikes,
                       family = gaussian,
                       prior_intercept = normal(5000, 2000),
                       prior = normal(-10, 5), 
                       prior_aux = exponential(0.0005),
                       chains = 5, iter = 4000*2, seed = 84735, prior_PD = TRUE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 8.8e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.88 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.740943 seconds (Warm-up)
## Chain 1:                0.137786 seconds (Sampling)
## Chain 1:                0.878729 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 4.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.46 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.465112 seconds (Warm-up)
## Chain 2:                0.125621 seconds (Sampling)
## Chain 2:                0.590733 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.577516 seconds (Warm-up)
## Chain 3:                0.102939 seconds (Sampling)
## Chain 3:                0.680455 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.3e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.533822 seconds (Warm-up)
## Chain 4:                0.105953 seconds (Sampling)
## Chain 4:                0.639775 seconds (Total)
## Chain 4: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 5).
## Chain 5: 
## Chain 5: Gradient evaluation took 1.1e-05 seconds
## Chain 5: 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
## Chain 5: Adjust your expectations accordingly!
## Chain 5: 
## Chain 5: 
## Chain 5: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 5: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 5: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 5: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 5: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 5: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 5: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 5: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 5: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 5: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 5: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 5: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 5: 
## Chain 5:  Elapsed Time: 0.719871 seconds (Warm-up)
## Chain 5:                0.178383 seconds (Sampling)
## Chain 5:                0.898254 seconds (Total)
## Chain 5:

Plotting results

# Trace plots of parallel chains
mcmc_trace(bike_model, size = 0.1)

# Density plots of parallel chains
mcmc_dens_overlay(bike_model)

Checking out summary statistics for the simulated prior model

tidy(bike_model, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = 0.80)
## # A tibble: 3 × 5
##   term        estimate std.error conf.low conf.high
##   <chr>          <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)  5615.     1955.     3071.    8173.  
## 2 humidity       -9.95      4.91    -16.3     -3.57
## 3 sigma        1394.     1980.      206.    4592.
  3. Convert the bike model into a data frame
# Store the 5 chains for each parameter in 1 data frame
bike_model_df <- as.data.frame(bike_model)

Plot 100 simulated prior model lines

# 100 simulated prior model lines
bikes %>%
  add_epred_draws(bike_model, ndraws = 100) %>%
  ggplot(aes(x = humidity, y = rides)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = bikes, size = 0.05)

  4. Our prior understanding of the relationship between humidity and rides is that as humidity increases, the number of rides decreases.

Exercise 9.10

  1. Plotting bikes data
# Load and plot data
data(bikes)
ggplot(bikes, aes(x = humidity, y = rides)) + 
  geom_point(size = 0.5) + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

The relationship is negative, but not particularly strongly negative (there is a lot of variability, and the slope is low).

  2. No; the data are so scattered that we need to incorporate a high level of variability through the unknown sigma, which makes Bayesian Normal regression a better fit.

Exercise 9.11

  1. Running the same model to estimate the posterior:
bike_model <- stan_glm(rides ~ humidity, data = bikes,
                       family = gaussian,
                       prior_intercept = normal(5000, 2000),
                       prior = normal(-10, 5), 
                       prior_aux = exponential(0.0005),
                       chains = 5, iter = 4000*2, seed = 84735)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 3.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.388056 seconds (Warm-up)
## Chain 1:                0.354677 seconds (Sampling)
## Chain 1:                0.742733 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.8e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 2: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 2: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 2: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 2: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 2: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 2: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 2: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 2: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 2: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 2: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 2: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.707358 seconds (Warm-up)
## Chain 2:                0.345584 seconds (Sampling)
## Chain 2:                1.05294 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 3: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 3: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 3: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 3: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 3: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 3: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 3: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 3: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 3: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 3: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 3: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.532774 seconds (Warm-up)
## Chain 3:                0.405605 seconds (Sampling)
## Chain 3:                0.938379 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 2e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 4: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 4: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 4: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 4: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 4: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 4: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 4: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 4: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 4: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 4: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 4: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.430333 seconds (Warm-up)
## Chain 4:                0.351128 seconds (Sampling)
## Chain 4:                0.781461 seconds (Total)
## Chain 4: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 5).
## Chain 5: 
## Chain 5: Gradient evaluation took 1.6e-05 seconds
## Chain 5: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 5: Adjust your expectations accordingly!
## Chain 5: 
## Chain 5: 
## Chain 5: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 5: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 5: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 5: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 5: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 5: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 5: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 5: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 5: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 5: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 5: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 5: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 5: 
## Chain 5:  Elapsed Time: 0.445236 seconds (Warm-up)
## Chain 5:                0.370269 seconds (Sampling)
## Chain 5:                0.815505 seconds (Total)
## Chain 5:
  2. Calculating diagnostics:
neff_ratio(bike_model)
## (Intercept)    humidity       sigma 
##     0.98895     0.98095     0.95275
rhat(bike_model)
## (Intercept)    humidity       sigma 
##    1.000068    1.000136    1.000145

The effective sample size ratios are all just below 1, which is excellent: ratios above 0.1 are typically considered acceptable, and values near 1 indicate very efficient sampling. The Rhat values are right around 1, which is also ideal.

The plots below likewise show the chains mixing quickly and stabilizing.

Plot results:

# Trace plots of parallel chains
mcmc_trace(bike_model, size = 0.1)

# Density plots of parallel chains
mcmc_dens_overlay(bike_model)

  3. Plotting 100 posterior model lines
# Store the 5 chains for each parameter in 1 data frame
bike_model_df <- as.data.frame(bike_model)

Plot 100 simulated model lines

# 100 simulated posterior model lines
bikes %>%
  add_epred_draws(bike_model, ndraws = 100) %>%
  ggplot(aes(x = humidity, y = rides)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = bikes, size = 0.05)

The posterior is MUCH more concentrated in terms of B0 (y-intercept around 4000); in the prior model, y-intercepts ranged from 2500 to 8000. The posterior also has a much more consistent slope (B1) than the prior model.

Exercise 9.12

  1. Creating a tidy summary of our model:
tidy(bike_model, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = 0.95)
## # A tibble: 4 × 5
##   term        estimate std.error conf.low conf.high
##   <chr>          <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)  4020.      225.     3574.    4460.  
## 2 humidity       -8.44      3.37    -15.1     -1.76
## 3 sigma        1574.       49.5    1482.    1678.  
## 4 mean_PPD     3485.       99.7    3289.    3681.
  2. The posterior sigma represents the variability of ridership around the model line after the data has been incorporated into the model. The posterior estimate for sigma (1574) is below the prior mean for sigma (2000), and its credible interval is narrow, meaning the data substantially sharpened our understanding of this variability.

  3. The 95% posterior credible interval for humidity ranges from -15.1 to -1.76. This means there is a 95% posterior probability that B1 (the change in rides for a one-unit increase in humidity, i.e., the slope of the line) falls between -15.1 and -1.76.

  4. Yes, because the entire 95% credible interval falls below zero.
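A credible interval like this can be read straight off posterior draws with quantile(). Here is a self-contained sketch using simulated slope draws in place of the actual bike_model_df$humidity column (the mean and sd are taken from the tidy summary above):

```r
# Simulated stand-in for the posterior slope draws
set.seed(84735)
slope_draws <- rnorm(20000, mean = -8.44, sd = 3.37)

quantile(slope_draws, c(0.025, 0.975))   # roughly (-15, -1.8)
mean(slope_draws < 0)                    # posterior probability of a negative slope
```

The last line shows another way to phrase the conclusion: nearly all of the posterior mass lies on negative slopes.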

Exercise 9.13

  1. Selecting one of the draws from the model and using it to estimate the typical number of riders on 90% humidity days:
first_set <- head(bike_model_df, 1)
first_set
##   (Intercept)  humidity    sigma
## 1    4346.145 -14.14211 1576.984
mu <- first_set$`(Intercept)` + first_set$humidity * 90
mu
## [1] 3073.355

This gives us the new mu; now we can plug it in, along with the same draw's sigma, to predict the number of riders on a 90% humidity day.

set.seed(84735)
y_new <- rnorm(1, mean = mu, sd = first_set$sigma)
y_new
## [1] 4125.57
set.seed(84735)
# 20,000 rows = 5 chains x 4,000 post-warmup draws
predict_75 <- bike_model_df %>% 
  mutate(mu = `(Intercept)` + humidity*90,
         y_new = rnorm(20000, mean = mu, sd = sigma))

head(predict_75, 4)
##   (Intercept)   humidity    sigma       mu    y_new
## 1    4346.145 -14.142111 1576.984 3073.355 4125.570
## 2    3846.842  -7.067930 1525.479 3210.728 3021.812
## 3    3951.057  -8.151076 1659.455 3217.460 4513.274
## 4    4022.744 -10.588479 1618.032 3069.781 3453.122

The mu column represents the typical ridership on a 90% humidity day; y_new represents a prediction of ridership tomorrow, an individual day with 90% humidity (it adds the day-to-day variability sigma on top of mu).

  2. Plotting density plots for the mu and y_new values.
# Plot the posterior model of the typical ridership on 90% humidity days
ggplot(predict_75, aes(x = mu)) + 
  geom_density()

# Plot the posterior predictive model of tomorrow's ridership
ggplot(predict_75, aes(x = y_new)) + 
  geom_density()

The posterior plot for typical ridership on a 90% humidity day is much more concentrated (the range of mu values spans roughly 1000 rides, with a peak density around 0.0035). In contrast, the posterior predictive model for tomorrow exhibits individual ridership values spanning from 0 to 9000, with a peak density around 0.00025. The predictive model therefore covers a much wider range of plausible ridership values.
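To quantify that difference, here is a self-contained sketch with simulated draws standing in for bike_model_df (the means and sds are illustrative, based on the tidy summary above), comparing the spread of mu against the spread of y_new:

```r
# Simulated posterior draws (illustrative values, not the actual model output)
set.seed(84735)
intercept <- rnorm(20000, 4020, 225)
slope     <- rnorm(20000, -8.44, 3.37)
sigma     <- rnorm(20000, 1574, 50)

mu    <- intercept + slope * 90              # typical ridership at 90% humidity
y_new <- rnorm(20000, mean = mu, sd = sigma) # individual-day prediction

sd(mu)      # spread of typical ridership: a few hundred rides
sd(y_new)   # spread of individual-day predictions: much larger
```

The extra sigma noise in y_new is exactly why the predictive density is so much flatter and wider than the density of mu.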

  3. Calculating an 80% credible interval for tomorrow’s riders:
# Simulate a set of predictions
set.seed(84735)
shortcut_prediction <- 
  posterior_predict(bike_model, newdata = data.frame(humidity = 90))
posterior_interval(shortcut_prediction, prob = 0.8)
##        10%      90%
## 1 1260.288 5282.819

This very wide range of plausible ridership values reflects the observations we gleaned from the density plot.

  4. We can plot the density function to confirm that the shortcut prediction model matches our manual calculations above
# Plot the approximate predictive model
mcmc_dens(shortcut_prediction) + 
  xlab("predicted ridership on a 90% humidity day")

This reflects the same range (0 to 9000) that we observed in the y_new density plot above, our manually simulated posterior predictive model.

Exercise 9.14

  1. I would guess that with higher windspeeds we would observe fewer riders; that is, a negative relationship.

  2. Full specification of the model: \[ Y_i \mid \beta_0, \beta_1, \sigma \sim N(\mu_i, \sigma^2) \text{ with } \mu_i = \beta_0 + \beta_1 X_i \]

We’ll estimate that on an average windy day there will be about 4000 riders, with a sizable degree of variability.

\[ \beta_0 \sim N(4000, 2000) \]

We’ll estimate that each one-mph increase in windspeed decreases ridership by about 100, though this could range from roughly 60 to 140.

\[ \beta_1 \sim N(-100, 20) \]

We’ll estimate a strong negative relationship, i.e., relatively little variability around the model line: a prior mean of 100 for sigma, giving l = 1/100 = 0.01.

\[ \sigma \sim \text{Exp}(0.01) \]

  3. Simulating and plotting the priors

bike_model <- stan_glm(rides ~ windspeed, data = bikes,
                       family = gaussian,
                       prior_intercept = normal(4000, 2000),
                       prior = normal(-100, 20), 
                       prior_aux = exponential(0.01),
                       chains = 4, iter = 4000*2, seed = 84735, prior_PD=TRUE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.869765 seconds (Warm-up)
## Chain 1:                0.155163 seconds (Sampling)
## Chain 1:                1.02493 seconds (Total)
## Chain 1: 
## (Chains 2-4 produce similar output and are omitted.)

Plotting simulated lines

# 100 simulated model lines
bikes %>%
  add_epred_draws(bike_model, ndraws = 100) %>%
  ggplot(aes(x = windspeed, y = rides)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = bikes, size = 0.05)

  1. Plotting windspeed vs ridership based on the data
ggplot(bikes, aes(x = windspeed, y = rides)) + 
  geom_point(size = 0.5) + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

I estimated B0 fairly accurately, but my prior for B1 was too steep: in the data, rides decrease from about 4000 to 2000 over a 30-mph windspeed interval, whereas my prior lines show declines as large as 8000 to 0 (more commonly 4000 to 0) over the same interval.
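These comparisons are easy to check by hand: the decline a slope implies over a 30-mph windspeed interval is just -B1 times 30. A quick sketch (the helper function is hypothetical, just for illustration):

```r
# Decline in rides implied over a 30-mph windspeed interval by slope beta1
implied_decline <- function(beta1, range = 30) {
  -beta1 * range
}

implied_decline(-267)  # ~8000 rides: the steepest of the prior lines
implied_decline(-133)  # ~4000 rides: a more typical prior line
implied_decline(-67)   # ~2000 rides: roughly the decline seen in the data
```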

Exercise 9.15

Simulating the posterior

bike_model <- stan_glm(rides ~ windspeed, data = bikes,
                       family = gaussian,
                       prior_intercept = normal(4000, 2000),
                       prior = normal(-100, 20), 
                       prior_aux = exponential(0.01),
                       chains = 4, iter = 4000*2, seed = 84735)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 8000 [  0%]  (Warmup)
## Chain 1: Iteration:  800 / 8000 [ 10%]  (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%]  (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%]  (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%]  (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%]  (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%]  (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%]  (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%]  (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%]  (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%]  (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.275745 seconds (Warm-up)
## Chain 1:                0.374057 seconds (Sampling)
## Chain 1:                0.649802 seconds (Total)
## Chain 1: 
## (Chains 2-4 produce similar output and are omitted.)

Plotting simulated lines

# 100 simulated model lines
bikes %>%
  add_epred_draws(bike_model, ndraws = 100) %>%
  ggplot(aes(x = windspeed, y = rides)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = bikes, size = 0.05)

The posterior much more clearly resembles the data than my prior estimates did. Again, the estimated lines are strongly concentrated around a mean B0 of 4000, and B1 appears to be much smaller in magnitude than in my prior. We can see this in that rides decline by about 2000 over the 0-30 mph windspeed interval, which matches the windspeed vs ridership plot much more closely than my prior plot (which showed a much steeper negative slope B1).

Exercise 9.16

  1. Downloading data and simulating the model:
data("penguins_bayes")
pengs <- penguins_bayes %>% drop_na() 
pengs_model <- stan_glm(flipper_length_mm ~ bill_length_mm, data = pengs,
                       family = gaussian,
                       prior_intercept = normal(200, 25),
                       prior = normal(8, 1), 
                       prior_aux = exponential(0.001),
                       chains = 4, iter = 5000*2, seed = 84735, prior_PD = TRUE)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.152002 seconds (Warm-up)
## Chain 1:                0.139414 seconds (Sampling)
## Chain 1:                0.291416 seconds (Total)
## Chain 1: 
## (Chains 2-4 produce similar output and are omitted.)
  2. Checking out the prior_summary()
prior_summary(pengs_model)
## Priors for model 'pengs_model' 
## ------
## Intercept (after predictors centered)
##  ~ normal(location = 200, scale = 25)
## 
## Coefficients
##  ~ normal(location = 8, scale = 1)
## 
## Auxiliary (sigma)
##  ~ exponential(rate = 0.001)
## ------
## See help('prior_summary.stanreg') for more details

Full specification of the model: \[ Y_i \mid \beta_0, \beta_1, \sigma \sim N(\mu_i, \sigma^2) \quad \text{with} \quad \mu_i = \beta_0 + \beta_1 X_i \]

\[ \beta_0 \sim N(200, 25^2) \] We estimate that a typical flipper length is around 200 mm, give or take about 25 mm. \[ \beta_1 \sim N(8, 1^2) \] We estimate that each 1 mm increase in bill length is associated with roughly an 8 mm increase in flipper length, most plausibly between about 6 and 10 mm. \[ \sigma \sim \text{Exp}(0.001) \] We place a vague prior on the variability of individual flipper lengths around the model line.

  3. Plotting 100 plausible model lines:

# 100 simulated model lines
pengs %>% 
  add_epred_draws(pengs_model, ndraws = 100) %>%
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = pengs, size = 0.05)

  4. My prior was a bit too aggressive about the magnitude of the relationship between bill_length_mm and flipper_length_mm (my prior mean for B1 was too high). Second, my prior drastically underestimated B0 (the y-intercept, or flipper length at a bill length of 0): it appears to be closer to 200, whereas my prior simulations allowed values as low as 0.
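A quick way to quantify how aggressive these priors were is to compute the ranges they imply via mean plus or minus 2 sd (plain arithmetic using the prior values specified above):

```r
# Roughly 95% of prior mass falls within mean +/- 2 sd
beta0_range <- 200 + c(-2, 2) * 25  # intercept: 150 to 250 mm
beta1_range <- 8 + c(-2, 2) * 1     # slope: 6 to 10 mm per mm of bill
beta0_range
beta1_range
```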

Exercise 9.17

  1. Plotting relationship between flipper length and bill length:
ggplot(pengs, aes(x = bill_length_mm, y = flipper_length_mm)) + 
  geom_point(size = 0.5) + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

  2. Simple normal regression could work here, since the scatter of points around the trend line is fairly narrow and roughly constant (i.e., the relationship is strong), so a single standard deviation parameter sigma should adequately describe the variability around the line.

Exercise 9.18

  1. Simulating the posterior of the penguins model
pengs_model <- stan_glm(flipper_length_mm ~ bill_length_mm, data = pengs,
                       family = gaussian,
                       prior_intercept = normal(200, 25),
                       prior = normal(8, 1), 
                       prior_aux = exponential(0.001),
                       chains = 4, iter = 5000*2, seed = 84735)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.206914 seconds (Warm-up)
## Chain 1:                0.35119 seconds (Sampling)
## Chain 1:                0.558104 seconds (Total)
## Chain 1: 
## (Chains 2-4 produce similar output and are omitted.)
  2. Plotting posterior model lines
# 100 simulated model lines
pengs %>% 
  add_epred_draws(pengs_model, ndraws = 100) %>%
  ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
    geom_line(aes(y = .epred, group = .draw), alpha = 0.15) + 
    geom_point(data = pengs, size = 0.05)

  3. Providing a 90% credible interval for the posterior model:
tidy(pengs_model, effects = c("fixed", "aux"),
     conf.int = TRUE, conf.level = 0.90)
## # A tibble: 4 × 5
##   term           estimate std.error conf.low conf.high
##   <chr>             <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)      124.       4.68    117.      132.  
## 2 bill_length_mm     1.74     0.106     1.57      1.92
## 3 sigma             10.7      0.420    10.0      11.4 
## 4 mean_PPD         201.       0.825   200.      202.

The bill length coefficient (B1) is about 1.74. This means that for every 1 mm increase in bill length, flipper length is expected to increase by about 1.74 mm on average.
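As an illustration of that interpretation, plugging the (rounded) posterior medians from the tidy() output into the model line gives a point estimate of the typical flipper length at a chosen bill length:

```r
b0 <- 124   # posterior median intercept, rounded from the tidy() output
b1 <- 1.74  # posterior median bill length coefficient, rounded
b0 + b1 * 50  # typical flipper length (mm) at a 50 mm bill: about 211 mm
```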

  4. Yes; the entire 90% credible interval for the bill length coefficient is positive, indicating a positive association between bill length and flipper length.

Exercise 9.19

  1. Selecting one draw from the posterior and using it to estimate the typical flipper length for a 51 mm bill length:
# Store the 4 chains for each parameter in 1 data frame
peng_model_df <- as.data.frame(pengs_model)
first_set <- head(peng_model_df, 1)
first_set
##   (Intercept) bill_length_mm    sigma
## 1    129.8397       1.611475 10.90916
mu <- first_set$`(Intercept)` + first_set$bill_length_mm * 51
mu
## [1] 212.0249

This gives us the new mu; now we can plug it in to generate a new distribution of flipper lengths that incorporates a set 51 mm bill length.

set.seed(84735)
y_new <- rnorm(1, mean = mu, sd = first_set$sigma)
y_new
## [1] 219.3039
set.seed(84735)
predict_51 <- peng_model_df %>% 
  mutate(mu = `(Intercept)` + bill_length_mm*51,
         y_new = rnorm(20000, mean = mu, sd = sigma))

head(predict_51, 4)
##   (Intercept) bill_length_mm    sigma       mu    y_new
## 1    129.8397       1.611475 10.90916 212.0249 219.3039
## 2    129.3282       1.624780 11.67715 212.1920 210.7459
## 3    121.7963       1.808393 10.96255 214.0244 222.5847
## 4    126.1526       1.691419 10.29443 212.4150 214.8539

The mu column represents the typical flipper length for a penguin with a bill length of 51 mm; the y_new column represents a posterior predictive draw of the flipper length of Pablo, an individual penguin with a 51 mm bill.

  2. Plotting density plots for the mu and y_new values.
ggplot(predict_51, aes(x = mu)) + 
  geom_density()

ggplot(predict_51, aes(x = y_new)) + 
  geom_density()

The “typical” distribution is much more concentrated, with a peak density of about 0.4. Pablo’s predictive distribution is far more spread out, with a peak density roughly one tenth as large (0.04). Accordingly, the spread of the “typical” distribution (roughly 210 to 216) is much narrower than that of Pablo’s (roughly 180 to 240).
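The extra spread in Pablo’s distribution is expected: each y_new draw layers individual-level noise on top of the uncertainty in mu, so, loosely, \[ \text{Var}(Y_{\text{new}}) \approx \text{Var}(\mu) + E(\sigma^2). \]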

  3. Computing an 80% credible interval for Pablo's flipper length
predict_51 %>% 
  summarize(mean = mean(y_new),
            lower_80 = quantile(y_new, 0.1),
            upper_80 = quantile(y_new, 0.9))
##       mean lower_80 upper_80
## 1 213.1735 199.6036 227.0211
  4. The 80% credible interval for the “typical” model would be narrower because, as noted above, the posterior draws of mu are much more tightly concentrated than the draws of y_new.
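A toy simulation illustrates this (the numbers 213, 1, and 10.7 below are stand-ins in the spirit of the posterior above, not the actual draws):

```r
set.seed(84735)
# Stand-in posterior draws: mu varies a little; y_new adds individual noise
mu_draws    <- rnorm(20000, mean = 213, sd = 1)
y_new_draws <- rnorm(20000, mean = mu_draws, sd = 10.7)

quantile(mu_draws, c(0.1, 0.9))     # narrow 80% interval for the typical penguin
quantile(y_new_draws, c(0.1, 0.9))  # much wider 80% interval for Pablo
```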

  5. Using posterior_predict() to confirm results

# Simulate a set of predictions
set.seed(84735)
shortcut_prediction <- 
  posterior_predict(pengs_model, newdata = data.frame(bill_length_mm = 51))


posterior_interval(shortcut_prediction, prob = 0.8)
##        10%      90%
## 1 199.6036 227.0211

We get the same 80% credible interval from the posterior_predict() shortcut as we computed by hand for Pablo.

Exercise 9.20

  1. The researchers anticipate that for each 1 g increase in body mass, flipper length will increase by 0.01 mm. They estimate that the distribution of this coefficient (B1) will vary with standard deviation 0.002 (relatively minimal variability). They therefore hypothesize a weakly positive relationship (small, positive slope), with relatively low deviation from the mean of 0.01.
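To make that concrete, here is the change in flipper length the prior implies over a 1000 g increase in body mass (simple arithmetic from the stated prior):

```r
slope_mean <- 0.01   # prior mean for beta1 (mm of flipper per g of body mass)
slope_sd   <- 0.002  # prior sd for beta1
1000 * slope_mean                          # typical change: 10 mm per 1000 g
1000 * (slope_mean + c(-2, 2) * slope_sd)  # plausible range: about 6 to 14 mm
```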

  2. Plotting the relationship

ggplot(pengs, aes(x = body_mass_g, y = flipper_length_mm)) + 
  geom_point(size = 0.5) + 
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

The relationship between body mass and flipper length is positive and fairly strong (as is visible in the tight concentration of points around the trend line).

  3. I would guess that sigma is smaller for X = body_mass_g. Based on the plot above and the earlier plot of bill_length_mm vs flipper_length_mm, the points are much less scattered around the trend line for X = body_mass_g than for X = bill_length_mm.

  4. Simulating the model

pengs_model <- stan_glm(flipper_length_mm ~ body_mass_g, data = pengs,
                       family = gaussian,
                       prior_intercept = normal(200, 50),
                       prior = normal(0.01, 0.002), 
                       prior_aux = exponential(0.0001),
                       chains = 4, iter = 5000*2, seed = 84735)
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 2.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.228427 seconds (Warm-up)
## Chain 1:                0.408507 seconds (Sampling)
## Chain 1:                0.636934 seconds (Total)
## Chain 1: 
## (Chains 2-4 produce similar output and are omitted.)
  5. Plotting the posterior of B1
# Density plots of parallel chains
mcmc_dens_overlay(pengs_model, pars = c("body_mass_g"))

The most plausible value of the body_mass_g coefficient is about 0.015, a bit higher than the researchers hypothesized. Additionally, the posterior standard deviation is about 0.001 (half of what the researchers hypothesized), so the coefficient is also less variable than the researchers expected.
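Reading those values off the density plot, the posterior's own mean plus or minus 2 sd range for the coefficient can be sketched the same way (0.015 and 0.001 are the approximate values quoted above):

```r
post_mean <- 0.015  # approximate posterior mean for the body_mass_g coefficient
post_sd   <- 0.001  # approximate posterior sd
post_mean + c(-2, 2) * post_sd  # plausible slopes: roughly 0.013 to 0.017
```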